11 research outputs found

A Collaborative Approach to Computational Reproducibility

Authors: Capone Rebecca; Chirigati Fernando; Freire Juliana; Rampin Remi; Shasha Dennis
Publication date: 01/01/2016

Although a standard in natural science, reproducibility has been only episodically applied in experimental computer science. Scientific papers often present a large number of tables, plots and pictures that summarize the obtained results, but then loosely describe the steps taken to derive them. Not only can the methods and the implementation be complex, but their configuration may also require setting many parameters and/or depend on particular system configurations. While many researchers recognize the importance of reproducibility, the challenge of making it happen often outweighs the benefits. Fortunately, a plethora of reproducibility solutions have recently been designed and implemented by the community. In particular, packaging tools (e.g., ReproZip) and virtualization tools (e.g., Docker) are promising solutions towards facilitating reproducibility for both authors and reviewers. To address the incentive problem, we have implemented a new publication model for the Reproducibility Section of Information Systems Journal. In this section, authors submit a reproducibility paper that explains in detail the computational assets from a previously published manuscript in Information Systems

arXiv.org e-Print Archive

Crossref

Directory of Open Access Journals

FigShare

Automatic Machine Learning by Pipeline Synthesis using Model-Based Reinforcement Learning and a Grammar

Authors: Cho Kyunghyun; Drori Iddo; Freire Juliana; Krishnamurthy Yamuna; Lourenco Raoni; Rampin Remi; Silva Claudio
Publication date: 01/01/2019

Automatic machine learning is an important problem at the forefront of machine learning. The strongest AutoML systems are based on neural networks, evolutionary algorithms, and Bayesian optimization. Recently, AlphaD3M reached state-of-the-art results with an order of magnitude speedup using reinforcement learning with self-play. In this work we extend AlphaD3M by using a pipeline grammar and a pre-trained model which generalizes from many different datasets and similar tasks. Our results demonstrate improved performance compared with our earlier work and existing methods on AutoML benchmark datasets for classification and regression tasks. In the spirit of reproducible research we make our data, models, and code publicly available. Comment: ICML Workshop on Automated Machine Learning

arXiv.org e-Print Archive

Open Repository and Bibliography - Luxembourg

AlphaD3M: An Open-Source AutoML Library for Multiple ML Tasks

Authors: Castelo Sonia; DE PAULA LOURENCO Raoni; Freire Juliana; Ono Jorge; Rampin Remi; Santos Aécio; Silva Claudio
Publication date: 12/09/2023

Peer reviewed. We present AlphaD3M, an open-source Python library that supports a wide range of machine learning tasks over different data types. We discuss the challenges involved in supporting multiple tasks and how AlphaD3M addresses them by combining deep reinforcement learning and meta-learning to construct pipelines effectively over a large collection of primitives. To better integrate the use of AutoML within the data science lifecycle, we have built an ecosystem of tools around AlphaD3M that support user-in-the-loop tasks, including selecting suitable pipelines and developing custom solutions for complex problems. We present use cases that demonstrate some of these features. We report the results of a detailed experimental evaluation showing that AlphaD3M is effective and derives high-quality pipelines for a diverse set of problems with performance comparable or superior to state-of-the-art AutoML systems.

Open Repository and Bibliography - Luxembourg

AlphaD3M: Machine Learning Pipeline Synthesis

Authors: Cho Kyunghyun; DE PAULA LOURENCO Raoni; Drori Iddo; Freire Juliana; Krishnamurthy Yamuna; Piazentin Ono Jorge; Rampin Remi; Silva Claudio
Publication date: 01/01/2021

Peer reviewed. We introduce AlphaD3M, an automatic machine learning (AutoML) system based on meta reinforcement learning using sequence models with self-play. AlphaD3M is based on edit operations performed over machine learning pipeline primitives, which provides explainability. We compare AlphaD3M with state-of-the-art AutoML systems (Autosklearn, Autostacker, and TPOT) on OpenML datasets. AlphaD3M achieves competitive performance while being an order of magnitude faster, reducing computation time from hours to minutes, and is explainable by design.

arXiv.org e-Print Archive

Open Repository and Bibliography - Luxembourg

Example ReproZip package -- Digit segmentation and recognition with scikit-learn and OpenCV

Author: Rampin Remi
Publication date: 01/01/2016

New York University Faculty Digital Archive

Example ReproZip package -- Digit segmentation and recognition with scikit-learn and OpenCV

Author: Rampin Remi
Publication date: 01/01/2016

ReproZip: 1.0.8

Authors: Dennis Shasha; Fernando Chirigati; Juliana Freire; Remi Rampin; Vicky Steeves

Behavior changes:
- No longer default to overwriting trace directories. ReproZip will ask what to do or exit with an error if one of --continue/--overwrite is not provided

Bugfixes:
- Fix an issue identifying Debian packages when a file is in two packages
- Fix Python error "Mixing iteration and read methods would lose data"
- Fix reprounzip info showing some numbers as 0 instead of hiding them in non-verbose mode
- Another fix to X server IP determination for Docker

Enhancements:
- New GUI for reprounzip, allowing one to unpack without using the command line
- Add filters to remove some common file types from packed files (.pyc) or detected input files (.py, .so, ...)
- Add JSON output format to reprounzip info
- Allow using the VirtualBox display to reproduce X11-enabled experiments

Downloads:
- reprozip (tarball)
- reprounzip (wheel, tarball)
- reprounzip-docker (wheel, tarball)
- reprounzip-vagrant (wheel, tarball)
- reprounzip-vistrails (wheel, tarball)
- reprounzip-qt 0.1 (wheel, tarball)
- Windows installer (Python 2.7, reprounzip, plugins and GUI)
- Mac installer (Python 2.7, reprounzip, plugins and GUI)

ZENODO

uvcdat: UV-CDAT 2.6

The UV-CDAT team is pleased to announce the release of UV-CDAT version 2.6. Many thanks to users, testers, and developers for helping UV-CDAT to reach this milestone. This is a bug-fix release: we have fixed several major and minor bugs in version 2.6 and therefore strongly recommend that users upgrade their UV-CDAT installation. From this release on, UV-CDAT is distributed via conda:

    conda install -c uvcdat uvcdat

or

    conda create -n uvcdat-2.6 -c uvcdat uvcdat

We also alert users to an Askbot website to help the UV-CDAT user community. This supports version 2.2 onward. See: http://uvcdat.askbot.co

ZENODO